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"Leaping" methods show great promise for significantly accelerating stochastic simulations of com- 
plex biochemical reaction networks. However, few practical applications of leaping have appeared in 
the literature to date. Here, we address this issue using the "partitioned leaping algorithm" (PLA) 
[L.A. Harris and P. Clancy, J. Chem. Phys. 125, 144107 (2006)], a recently- introduced multiscale 
leaping approach. We use the PLA to investigate stochastic effects in two model biochemical re- 
action networks. The networks that we consider are simple enough so as to be accessible to our 
intuition but sufficiently complex so as to be generally representative of real biological systems. 
We demonstrate how the PLA allows us to quantify subtle effects of stochasticity in these systems 
that would be difficult to ascertain otherwise as well as not-so-subtle behaviors that would strain 
commonly-used "exact" stochastic methods. We also illustrate bottlenecks that can hinder the ap- 
proach and exemplify and discuss possible strategies for overcoming them. Overall, our aim is to 
aid and motivate future applications of leaping by providing stark illustrations of the benefits of the 
method while at the same time elucidating obstacles that are often encountered in practice. 



PACS numbers: 82.39.-k, 87.10.Rt, 87.18.Tt, 87.18.V1 



I. INTRODUCTION 

Biological systems are inherently noisy, or stochastic. 
A primary source of this noise is the random nature of 
molecular interactions that predominates when molec- 
ular copy numbers are low, so-called "intrinsic" noise 
[U, i, II, 13, i, i, 0, II- Intrinsic noise has been impli- 
cated as the source of non-genetic variability in clonal 
cell populations d, [l^, [ll| and can profoundly affect the 
dynamical behavior of a biological system, both to its 
benefit as well as its detriment 2J. For example, stochas- 
tic effects in gene expression have been shown to speed 
the response of yeast cells to a challenge, allowing them 
to survive in conditions where they otherwise could not 
[l^ . Conversely, noise can interfere with the workings 
of circadian clocks [H, , systems whose reliability is 
essential for survival. As such. Nature has developed reg- 
ulatory mechanisms to attenuate the noise All m 
all, accounting for the effects of stochasticity is essential 
for gaining a clear understanding of the design principles 
underlying many biological processes. 

Nevertheless, biological-systems modeling today con- 
tinues to be performed primarily at the continuous- 
deterministic scale, usually in the form of ordinary dif- 
ferential equations (ODEs). These formalisms explicitly 
ignore stochastic fluctuations [13, 13 • However, the 
practice understandably persists because the methods 
are well established and numerous easy-to-use algorithms 
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are available at little to no cost. Moreover, established 
stochastic methods, such as Gillespie's stochastic simu- 
lation algorithm (SSA) [H, |2^ l2i|, are in many cases 
simply too computationally intensive to apply to realis- 
tic models of biological networks [12, [1^ . 

There is great interest, therefore, in developing accel- 
erated stochastic simulation methods that can accurately 
capture noise effects but at significantly reduced compu- 
tational cost relative to standard approaches. Ultimately, 
it is hoped that these methods will supplant ODEs as the 
default method of choice in computational systems biol- 
ogy. Approaches that have been proposed in this regard 
include (but are not limited to): (i) modification and op- 
timization of Gillespie's original SSA [13,111,121,1231, (ii) 
"leaping" methods which ignore the exact moments at 
which reaction firings occur feS, 28, 29, 30, 31, 32, 33, sl, 
|3l,|3i,lMl38|,l3i,l40,ISIii,|43|,IMl4^ and (iii) "hybrid" 
methods which couple different simulation techniques 
(e.g., the SSA and ODEs) into a single, overarching al- 
gorithmic framework [H, [^ IH, H iHl, HH, [13, [H ll. 
Of these, leaping methods are particularly popular, pre- 
sumably because of the sound theoretical foundation on 
which they stand [H, [111 . 

Despite their popularity, however, very few practical 
applications of leaping have appeared in the literature 
to date [111. This is a curious fact that has yet to be 
fully explored or explained. Moreover, those applications 
that have appeared (e.g., [H, [13, [111) are generally brief 
in their presentation of the algorithms used and do not 
report much by way of the capabilities and limitations 
of the method. Thus, it is difficult, if not impossible, to 
project the potential utility of the approach onto other 
and more complex biological networks. 

Here, we address this issue by using the "partitioned 
leaping algorithm" (PLA) [ill , a recently-introduced ex- 



2 



tension and variant of the r-leaping method of Gillespie 
and co-workers [H, [H, [H, [s^l , to systematically inves- 
tigate the effects of stochasticity in two model biochemi- 
cal reaction networks. The systems that we consider are 
intuitively simple yet they contain attributes that are 
ubiquitous to complex biological networks, such as en- 
zyme catalysis and feedback control. We perform a de- 
tailed and in-depth investigation using leaping with the 
aim of illuminating both the capabilities and limitations 
of the method and, hence, aiding and motivating future 
applications of the approach. The PLA operates on the 
same basic principles that underlie all leaping algorithms 
and its performance with respect to these systems can 
thus be seen as generally reflective of the entire class of 
method. That being said, there are certain aspects of the 
PLA, which we will expound upon below, that make it 
particularly appealing from a practical point of view. 

We begin in Sec. |lT]by introducing the biochemical re- 
action networks that we investigate in this article. We 
then briefly describe in Sec. Illll the PLA, the time series 
analysis method and the statistical tests employed in this 
work. Detailed results for the two networks are subse- 
quently presented in Sec. lIVI We conclude in Sec. IVl with 
a discussion of the implications of these results, a possi- 
ble explanation for why practical applications of leaping 
are so scarce in the literature and the future outlook for 
leaping methods in computational systems biology. 



II. THE NETWORKS 

The systems that we investigate are a core model for 
calcium oscillations in hepatocytes introduced by Kum- 
mer et al. [S^ and the three-gene "repressilator" of 
Elowitz and Leibler [l^. These systems are relatively 
simple, yet they are not "toy" problems in the sense 
that they contain non-trivial features that are ubiqui- 
tous to biochemical systems, such as enzyme catalysis 
and feedback control. Moreover, both systems emit large- 
amplitude oscillations which give rise to the kinds of wide 
disparities in species populations that leaping algorithms 
are specifically designed to cope with [23, 45]. All in all, 
these systems provide an ideal testbed for investigating 
the practical utility of leaping methods in computational 
systems biology. 

Our investigation entails using the PLA to probe be- 
havioral changes that arise in these systems due to 
changes in various system properties. Specifically, we in- 
vestigate the transition from stochastic to deterministic 
behavior that accompanies increases in the system vol- 
ume (i.e., population levels) in the calcium-oscillations 
model and increases in the gene-protein binding and un- 
binding rate constants in the repressilator. The salient 
feature of our investigation is that we are able to ascer- 
tain, in a systematic way, the performance characteristics 
of the leaping algorithm over a wide spectrum of condi- 
tions. We thus identify cases where leaping proves par- 
ticularly beneficial, where it "bogs down," and various 



points in between. Further details of the networks are 
provided in the subsections below. 



A. Calcium Oscillations 

Intracellular calcium is an important second messen- 
ger for the functioning of many cell types, both in plants 
and in animals. It is involved in a multitude of functions 
during the lifetime of a cell, including fertilization, devel- 
opment and death [60| . The dynamics of intracellular cal- 
cium are not smooth and continuous, however. Rather, 
they are driven by small numbers of receptors and ion 
channels that can give rise to highly stochastic behav- 
ior. Indeed, experiments have shown that calcium waves 
are triggered by elementary stochastic events known as 
"blips" and "puffs" ^61.] . Incorporating stochasticity into 
models of calcium oscillations is thus of high interest. 

Many theoretical models have been proposed to de- 
scribe the oscillatory dynamics of intracellular calcium 
[6ll[6^. Kummer et al. [5^ proposed a model for calcium 
oscillations in hepatocytes (liver cells) that displays a rich 
variety of behaviors. The model features self-enhanced 
activation of the Ga subunit of the receptor complex 
and is able to capture many aspects of experimentally- 
observed behavior that eluded previous models. The au- 
thors also presented a simplified version of the model that 
displays the same basic behaviors as the full model, thus 
emphasizing the "core" mechanisms driving the oscilla- 
tions [sg^. 

In Table [H we show the Kummer et al. [5^ core model 
for calcium oscillations in hepatocytes. The model con- 
sists of eight reactions involving three species: the acti- 
vated form of phospholipase C (PLC*), the a subunit of 
the receptor-bound G protein {Ga) and cytosolic calcium 
ions (Ca). Note that the model is in a reduced form, with 
degradation processes described in terms of Michaelis- 
Menten kinetics. Reaction 2, which is the prime fea- 
ture of this model, describes the agonist-initiated [e.g., 
adenosine triphosphate (ATP)] autocatalytic activation 
of the Ga subunit. The parameter ^2 thus amounts to 
the product of the second-order association constant and 
the agonist concentrati on and is a prim ary determinant 
of the system behavior. iKummer et al.l showed that with 
increasing ^2 the system behavior transitions from sim- 
ple Ca^+ spiking oscillations, to complex oscillations, to 
chaotic behavior and, finally, to an elevated steady state 

Hi,!!!. 

We also see in Table |T] that the model contains various 
feedback loops which drive the oscillatory behavior of the 
network. Specifically, PLC* and Ca are created autocat- 
alytically in reactions 5 and 7, respectively, through the 
action of Ga- In reactions 3 and 4, however, Ga is de- 
graded enzymatically by the actions of PLC* and Ca, 
respectively. Thus, in the correct parameter range, in- 
creased levels of Ga lead to increased levels of PLC* and 
Ca which, in turn, lead to increased degradation of Gq, 
which leads to decreased levels of PLC* and Ca, and so 
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TABLE L Kummer et aL [s^ core model for calcium oscillations in hepatocytes. '0' represents a source or a sink and 
k2 — 2.85 s~^ puts the system into the "periodic-bursting" regime (see text). Ga represents the activated a-subunit of the 
intracellular receptor-bound G-protein, PLC* the activated form of phospholipase C, and Ca cytosolic calcium ions. Note that 
to perform stochastic simulations all parameters must be devoid of molar units (M). Parameters with molar units are thus 
multiplied by Na^ (Avogadro's number x system volume) prior to runtime. 



Reaction 


Rate Expression 


Parameter Value(s) 


f. 


0^G<, 




fci = 0.2f2 M s"^ 


2. 


Ga — > 2Ga 


k2[Ga\ 


fca = 2.85 s-i 


3. 


Ga+ PLC* -> PLC* 


fc3[GJ[PLC*]/(iC4-F[Gc]) 


fcs = f.52 s-\ K4, = 0.f9 M 


4. 




k5[Ga][CA]/{K(, + [Ga]) 


fcs = 4.88 s-\ Ka = 1.18 M 


5. 


Get G OL ~\~ PLC 


k7[Ga] 


fcr = 1.24 s-i 


6. 


PLC* 


fc8[PLC*]/(if9-f [PLC*]) 


fcg = 32.24 M s-\ = 29.09 M 


7. 


GoL " ^ GoL H~ CcL 


kw[Ga] 


fcio = 13.58 s-i 


8. 


Ca-> 


fcn[Ca]/(A:i2-f [Ca]) 


fell = 153.0 M s-\ Ku = 0.16 M 



Initial conditions: [Ga] = [PLC*] = [Ca] = 0.01 M 



on and so forth; 

In Ref. [13,11 ummer et aLl compared the deterministic 
behavior of this model to results of stochastic simulations 
performed using the SSA. The goal was to determine 
points of transition to determinism for various dynamical 
regimes of the model (e.g., "periodic spiking," "periodic 
bursting," "chaos") and to provide general insight as to 
when a deterministic treatment is applicable and when a 
stochastic approach is necessary. SSA simulations were 
performed for various system sizes (with fixed concen- 
trations) and the point of transition to determinism was 
estimated via visual comparison of stochastic and deter- 
ministic time courses. Visual inspection was necessary 
because of the high computational expense of the SSA 

Here, we extend the analysis of IKummer et al.l for the 
"periodic-bursting" regime, a main focus of Ref. [s^ - 
The regime is characterized by complex Ca^+ oscillations 
comprised of three-peak complexes (see below) , behavior 
that is reminiscent of that seen exp erimentally in hepa- 
tocytes stimulated by ATP [H,!!!!!!]. Using the PLA 
and the peak-analysis tool described in Sec. IHI[ we col- 
lect large amounts of peak amplitude and peak-to-peak 
distance data at various system volumes and quantify the 
relationship between stochasticity and system size, some- 
thing that was not feasible in Ref. [6J] because of the 
limitations of the SSA. This allows us to pinpoint, from 
a statistical perspective, the points of transition to de- 
terminism. As we shall see, these differ, to some extent, 
from those reported in [65 ]. 



B. Repressilator 

Synthetic biology is a relatively new and rapidly grow- 
ing scientific field [651 [66l . \&A iGSj . In analogy with elec- 
trical circuit design, synthetic biologists attempt to use 
their knowledge of fundamental biological principles to 
design and construct artificial biological "circuits" that 



confer novel function unto their host. In this way, one 
can isolate and control specific aspects of a biological pro- 
cess and circumvent the immense complexity of natural 
biological systems, providing a means by which current 
theoretical understanding can be tested and scrutinized. 
Moreover, the long-term goal is to develop protocols for 
logical control. One can envision a time when microor- 
ganisms are "programmed" at the genetic level to carry 
out important functions, such as cleaning up oil spills or 
delivering tumor-suppressing drugs to specific locations 
within the body [65, 66]. 

Numerous artificial biological circuits have been con- 
structed in bacteria and demonstrated to perform as de- 
signed. One such network is the repressilator, a three- 
gene synthetic genetic regulatory network developed by 
Elowitz and Leibler [T3]. Each gene in the repressilator 
produces a protein which represses the next gene in the 
sequence; the protein product of the last gene represses 
the first gene, thus closing the loop. This construct is 
known in microelectronics as a "ring oscillator" I6al . As 
implemented experimentally in Escherichia coli [l3| , the 
repressilator consists of the genes lad, tetR, and A-c/; 
Lad protein represses tetR, etc. (Fig. [J). [Standard con- 
vention is to denote genes in italicized font beginning 
with a lower-case letter (e.g., tetR), mRNA transcripts 
in non-italicized font beginning with a lower-case letter 
(e.g., tetR) and proteins in non- italicized font beginning 
with a capital letter (e.g., TetR).] 

Under the right conditions, i.e., within the correct re- 
gion of parameter space, the repressilator oscillates, act- 
ing as a biological clock. However, determining the con- 
ditions for oscillation is nontrivial and theoretical mod- 
eling was employed to identify the appropriate design 
criteria jl3| . Once functional, a particularly interesting 
experimental observation was the significant fiuctuations 
in amplitude and period exhibited by the circuit. Nat- 
ural oscillators, such as circadian clocks, do not exhibit 
such variability p^ . [6^ pTOj and subsequent modeling in- 
dicated that Nature must employ some form of regulatory 
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FIG. 1: Schematic diagram of the repressilator. Each gene 
{lad , tetR, X-cI) produces a protein which binds to the oper- 
ator site of the promoter driving expression of the next gene in 
the sequence, thus repressing it. Within the correct region of 
parameter space the repressilator oscillates, a so-called "ring 
oscillator" iGol. 



control in order to overcome the problem [ij, [l5| . The 
repressilator thus succeeded in providing valuable insight 
regarding the design principles underlying an important 
biological process. 

The extensive use of modeling in the design and anal- 
ysis of the repressilator, as well as the highly stochastic 
behavior exhibited by the network, motivates our inves- 
tigation using the PLA. In Table |lTl we show the basic 
form of one-third of the repressilator model (all three 
genes are considered equivalent). This corresponds to 
the "stochastic" model of Elowitz and Leibler [l^]. Here, 
all reactions are treated as elementary using simple mass- 
action kinetics (i.e., rates directly proportional to the re- 
actant population levels). Each gene is assumed to have 
two binding sites for repressor protein, with binding oc- 
curring sequentially, and the unbound gene transcribes 
mRNA 1000 times faster than the singly- or doubly- 
bound gene. mRNA also translates protein autocatalyt- 
ically and mRNA and protein degrade with half-lives of 
120 and 600 s, respectively [l^. 

We also include in Table |TT] various multiplicative fac- 
tors: a "telegraph factor" 7, an "RNA factor" ry and a 
"protein factor" p [7l|, [ill . These factors allow us to con- 
trol and tune the various sources of noise in the system. 
For example, increasing 77 increases the rates of gene tran- 
scription, resulting in larger mRNA populations and less 
mRNA- related "shot noise," i.e., noise arising from the 
fact that the system is comprised of discrete numbers of 
interacting entities (in electrical circuits, shot noise arises 
from discrete numbers of charge carriers; in optical de- 
vices, from discrete numbers of photons) [tiI l72j|. The 
translation rate is divided by 77, however, thus cancelling 
out the effect of increased mRNA levels on the protein 
production rates. Protein-related shot noise is controlled 
similarly through the protein factor p while the amount 
of "telegraph noise," i.e., that associated with the ran- 
dom switching between the ON and OFF states of the 
genes (reminiscent of an electronic telegraph transmitting 
Morse code) (tT], [T^I , is controlled through the parameter 
7- 

In this article, we focus primarily on the telegraph fac- 
tor 7. We do so because the performance of the leap- 



TABLE II: Basic form of one-third of the full repressilator 
model (all three genes are considered equivalent) [l^. 
represents one of the three repressilator genes {lad, tetR, or 
\-d) and Pr the corresponding repressor protein (Lad for 
tetR, etc. — see Fig. [TJ. rrix and Px represent the mRNA and 
protein products of Qx, respectively. All reactions are treated 
using simple mass-action kinetics and all parameters with in- 
verse molar units (M~^) are divided by Na^ prior to runtime. 
k\,k2 are rate constants for forward repressor binding while 
k-i,k-2 are for the reverse reactions. Also shown (see text 
for explanation) are the "telegraph factor" 7, the "RNA fac- 
tor" r^, and the "protein factor" p (equivalent for all genes) 



Here, we set ?7 = p = 1000 and vary 10 <7<1. 



Reaction 



Parameter Value 



T,2. 


Qx +Pr ^ {gx-Pr} ' 


ki = 
k-i 


10^7/p M-^s"^ 
= 224.O7 s"^ 


3,4. 


{gx -Pr } +Pr ^ {gx -Pr -Pr } 


'k2 = 

k-2 


10^7/p M-^s"^ 
= 9.O7 s~^ 


5. 


gx ^ gx + rux 


k3 = 


0.5'n s"^ 


6. 


{gx-Pr} {gx-Pr} + rUx 


ki — 


SxlQ-^T? s"^ 


7. 


{gx-Pr-Pr} {gx-Pr-Pr} + nix 


ks = 


SxlO^^T? s"^ 


8. 


trix rUx + Px 


k(i = 


0.167p/T7 s"^ 


9. 


mj; — > 


k7 = 


ln(2)/120 s"^ 


10. 




kg = 


ln(2)/600 s-i 



Initial conditions: 

[rriTetR] = 3.8 fiM; [mci] = 8.1 ^J.M■, [rriLacl] = 0.15 /xM; 
[PTetfl] = 0.22 mM; [pci] = 2.4 niM; [pLad] = 0.20 mM; 
QTstR = 9CI = 9 Lad = 1 (molecule); 
all {gx-Pr} and {gx-Pr Pr} = 0. 



ing algorithm is strongly affected by this parameter: at 
small values the method performs exceptionally well but 
falters as it is increased, approaching the performance 
of the SSA (see Sec. IIV Ap . With the system volume 
fi = 1.4x10^^^ 1 (the volume of a typical E. Coli cell) and 
rj and p set to high values (i.e., 1000) in order to dampen 
the mRNA- and protein-related noise sources, we inves- 
tigate how the system behavior changes for 10"'* < 7 < 1. 
We thus observe how the actual values of the gene-protein 
binding and unbinding rate constants, as opposed to sim- 
ply their ratios, affect the overall dynamical behavior of 
the system as well as the performance of the PLA. 

We also find it convenient to investigate a reduced form 
of the repressilator model obtained by applying the "par- 
tial equilibrium assumption" (PEA) to the first four re- 
actions in Table |TT1 Assuming each reversible reaction to 
be in rapid equilibrium, simple algebra leads to effective 
rate expressions of the Adair form (TSj for mRNA pro- 
duction from the free, singly-bound and doubly-bound 
genes (see Appendix \K\ for derivations). These expres- 
sions are strictly valid in the limit 7 — > cx3. Doing so 
reduces the 30 reactions of Table [ll] to 18 in Table IIIII 
Note that the reduced model in Table IIIII differs from 
the "deterministic" model of Elowitz and Leibler [l^ in 
that the expressions in Table IIIII are directly derivable 
from the reactions in Table [Til via application of the PEA 
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TABLE III; Basic form of one-third of the reduced repressila- 
tor model. Parameter values are the same as in Table HIl The 
Adair functional forms [73] describing mRNA production are 
similar to the well-known Hill forms, but are formally correct 
for 7— >oo (see Appendix |A)| . 



Reaction 



1- ^ gx + nix 

2. {gx-pr} {gx-pr} + rrix 

3- {gx-Pr-Pr} {gx-Pi-Pr} + TUx 

4. nix nix + Px 

5. nix 

6. ^ 



Rate Expression 



k3K^K2/f{[pr]) 
k4K2[pr]/f{[pr]) 
k5[Pr?/fi[Pr]) 

kelnix] 
krlnix] 

ks [px] 



K, = k^i/k„ (i = l,2) 

f([pr]) = KlK2+K2[pr] + [pr? 

while those in Ref. [l^ are not. 

III. METHODS 




FIG. 2: Example calcium-oscillations time course and the 
Gaussian fits obtained using the peak-analysis software em- 
ployed in this work. Results are for a system volume — 
10"^^ 1. Also shown are peak and peak-complex bracket 
points identified by the fitting algorithm. Notice that stochas- 
tic effects lead at this small volume to the identification of a 
fourth peak in the second peak complex. (Inset) Blown up 
view of the second and third peaks in the second peak com- 
plex. Squares correspond to where fitting began, diamonds 
to where fitting concluded. 



To carry out our investigations, we use the parti- 
tioned leaping algorithm ^451] . an extension and vari- 
ant of the T-leaping method of Gillespie and co-workers 
[H, m, m, nil. The PLA operates on the same basic 
principles that underlie all leaping methods: calculate a 
time step t over which all reaction rates in the system re- 
main "essentially constant" and then determine the num- 
ber of times each reaction fires within that interval by 
sampling from an appropriate probability distribution. 
The primary difference between our approach and other 
leaping algorithms is that we utilize the entire theoreti- 
cal framework developed by Gillespie [H, for bridg- 
ing from the discrete-stochastic description of reaction 
dynamics to the more familiar continuous-deterministic 
representation p5| . 

At each step of a PLA simulation, reactions are par- 
titioned, using theoretically-sound criteria [H, [E^, into 
four categories based on the calculated time step and 
the current reactant population levels. The categories 
correspond to different levels of approximation; reac- 
tions with small reactant populations garner a detailed 
"exact-stochastic" classification (i.e., a SSA treatment) 
while those with larger populations receive coarser de- 
scriptions. The coarse classifications range from "Pois- 
son" to "Langevin" to "deterministic," with the levels 
of stochasticity decreasing and approximation increasing 
as one moves up the hierarchy. The result is a truly 
multiscale method, where fluctuations associated with 
rare events are correctly described while frequent events 
"leapt" over multiple reaction firings at a time. As such, 
the PLA accomplishes what so-called hybrid methods 
HE 113, m, \M, m, HE US, m, aim to do but in a 
much more simple and theoretically-sound way. For even 
moderately-sized systems the computational gains of the 
PLA relative to the SSA can be significant J45|. We refer 



the interested reader to Ref. 45] for further details re- 
garding the theoretical foundations and implementation 
of the PLA. 

The promise of the PLA, and leaping algorithms in 
general, is that long-time stochastic simulations can be 
performed, allowing for large-scale data collection and 
quantitative statistical analyses of the resulting time se- 
ries. However, leaping simulations produce noisy time- 
evolution trajectories and automated data collection 
tools that can compensate for noise are needed. Here, we 
use in-house time-domain peak-analysis software for this 
purpose. Borrowing ideas from the automated identifica- 
tion of peaks in mass spectral data [t^, [zE] , the software 
identifies "significant" peaks within a time series and fits 
Gaussians to the data in order to wash out the noise. An 
example calcium-oscillations time series and the Gaus- 
sian fits achieved using the peak-analysis software are 
shown in Fig. [21 

Using this tool, we collect large amounts of peak am- 
plitude and peak-to-peak distance data from simulated 
time series and perform various statistical analyses. We 
calculate averages and variances from long-time PLA and 
deterministic simulation runs (see [76| for an explanation 
of variance in the deterministic case) and perform z-tests 
on the differences in means and F-tests on the ratios of 
variances [tJ- We also calculate coefficients of variation 
(COVs), defined as the ratio of the standard deviation to 
the mean [3| , in order to quantify the relative importance 
of the noise. Finally, we put the data into the form of 
smoothed histograms [i^ and calculate "histogram dis- 
tances," D, and "self distances," D^^f [H, III], so as to 
account for any particulars in the shapes of the distribu- 
tions (e.g., long tails, bimodal features, etc.). We do all 
of this for various system properties (i.e., volumes, tele- 
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graph factors) in order to quantify changes in the system 
behavior and to identify points of transition to determin- 
ism. 

AU PLA simulations reported in this work were per- 
formed using the parameters '« l'=3, '3> l'=100 [i^ and 
using the species-based r-selection procedure of Cao et al. 
[3^ . as modified in Harris and Clancy (i^, with "error 
control parameter" e = 0.03. These represent "typical" 
values for these parameters. Derivations of the gi val- 
ues [13, used in r selection for the Michaelis-Menten 
and Adair reactions of Tables H] and IIIII are given in Ap- 
pendix |B] It should also be noted that an attractive fea- 
ture of the PLA is its ability, via simple manipulation of 
the classification parameters, to force simulations at any 
level of description (45l |. Thus, both deterministic and 
exact-stochastic simulations reported in this work were 
performed using the same code as for the PLA simula- 
tions. The PLA segues to an explicit Euler method in 
the deterministic limit and to the next-reaction method 
[2JI in the exact-stochastic limit. 



IV. RESULTS 
A. Calcium Oscillations 

The periodic-bursting regime of the Kummer et al. [59| 
calcium-oscillations model (Table |l| is characterized by 
large-amplitude complex oscillations in which the Ca^+ 
repeating unit is a three-peak complex. In Fig. [3l we 
show example time courses at three different system vol- 
umes spanning four orders of magnitude obtained us- 
ing the PLA. Also shown are the classifications achieved 
along the time courses for the reaction Ga + Ca Ca 
(Table U reaction 4). The classifications range from 1-4, 
with 1 being the finest level of description (exact stochas- 
tic) and 4 the coarsest (deterministic). 

The plots in Fig. [3] starkly illustrate why this system 
is ideally suited for treatment via the PLA: the classifi- 
cations oscillate in time along with the reactant species 
populations. When the Cs?^ population is small we see 
that the reaction gets classified at the exact-stochastic 
level, while coarser descriptions are employed when the 
population is large (similar behavior is seen for other re- 
actions in the system as well — data not shown). As such, 
the PLA is able to accurately capture stochastic effects 
that arise in this system when the species populations be- 
come small without suffering from the characteristic in- 
efficiency of the SSA when the populations become large. 

This is evident in Fig. 21 where we show results of a 
step and timing analysis comparing the performance of 
the PLA to the SSA. As expected, we see a linear increase 
in the computational expense of the SSA with increasing 
system size (see Fig. H caption) 0,|2l[lii- The PLA, 
on the other hand, exhibits more complex behavior, with 
the expense initially remaining constant, then increasing 
slightly, going through a maximum dX ^^1 = 10""'^* 1 and 
then dropping off sharply before finally leveling off. Inter- 



estingly, similar behavior was seen for the simple example 
systems in Ref. [45|, which were specifically designed to 
showcase the strengths of the algorithm. Most impor- 
tantly, however, is that Fig. 2] clearly illustrates that for 
all but the smallest system size considered the PLA far 
outperforms the SSA, by as many as eight orders of mag- 
nitude in simulation steps at = 10~^^ 1. It is these types 
of accelerations that make quantifying stochastic effects 
in this system possible, something that was unachievable 
in Ref. [63| because of the limitations of the SSA. 

Our statistical results are shown in Fig.O In all cases, 
we compare results obtained from both PLA and SSA 
simulations to deterministic predictions for the three at- 
tributes considered, namely, first-peak amplitudes, first- 
to-second intra-complex distances, and first-to-first inter- 
complex periods (see Fig. [Sj top-left panel). In the case 
of the SSA, we were only able to obtain data for the three 
smallest system sizes considered because of the compu- 
tational expense of the method. 

In Figs. Eh-El!;, we compare averages and modes ob- 
tained from the PLA and SSA to deterministic predic- 
tions. The results are shown as percent deviations from 
determinism. In all cases, we see small yet statistically 
significant deviations from determinism at small volumes 
and, in the case of the PLA, a rapid convergence to the 
deterministic limit with increasing system size. Close in- 
spection reveals that full convergence is achieved for all 
attributes by ri = 10^^^ 1. It is also clear in Figs.jSla and 
0> that there are discrepancies between the PLA results 
and the SSA results. The discrepancies are small, how- 
ever, on the order of 1% or less in all cases, and decrease 
with decreasing e (data not shown). Interestingly, there 
are virtually no discrepancies between the PLA results 
and the SSA results in Fig. [Sjc, the inter-complex peri- 
ods. We cannot at present explain why the PLA achieves 
greater accuracy for this attribute over the others. Un- 
derstanding the sources of error in leaping algorithms and 
developing strategies for attenuating them is an area of 
current interest [23, [13] ■ Suffice it to say that in this case 
the PLA achieves very good to excellent accuracy for all 
quantities considered. 

In Figs. [Sji and [5]e, we consider the distributions of 
the attributes. Figure [Sji shows data for standard devi- 
ations, a simple point statistic, while Fig. [5]e considers 
the shapes of the distributions through the histogram 
distance [isl In Fig. [51ff, we see almost perfect cor- 
respondence between the PLA and the SSA results. In 
Fig. [5ls, however, we see discrepancies in the histogram 
distances for the amplitude and the intra-complex dis- 
tance (see inset). Taken together, along with Figs. [5la 
and [3?, this indicates that the PLA is accurately captur- 
ing the shapes of the distributions but they are shifted 
slightly relative to those obtained with the SSA. 

As far as convergence to determinism, both Figs. [5ltf 
and[5]e give the same result: the different attributes con- 
verge to the deterministic limit at different rates and with 
different transition points. The intra-complex distance 
converges the fastest, followed by the peak amplitude 
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FIG. 3: Example Ca'^"'" periodic-bursting time courses and associated classifications for Ga + Ca— >Ca (Table H] reaction 4) 
obtained using the PLA at three different system volumes. Classifications are: (1) exact stochastic, (2) Poisson, (3) Langevin, 
(4) deterministic. Also shown {top-left panel) are the three system attributes investigated: first-peak amplitudes, first-to-second 
intra-complex distances and first-to-first inter-complex periods. 
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FIG. 4: Average numbers of steps from (a) and total CPU 
times for (6) 10 000 PLA and SSA simulation runs of 20 s 
for the Kummer et al. [s^ core model for calcium oscillations 
(Table HI). Both graphs are shown as log-log plots. SSA values 
at n = 10"^° and 10"^^ 1 are based on 1000 and 100 simulation 
runs, respectively. SSA values at > 10~^* 1 are extrapola- 
tions (not based on actual data). Note that the PLA steps 
and CPU times go through maxima at ~r2=10"^* 1. Similar 
behavior was observed for the example systems in Ref. [isl ]. 
Also note that in the case of the SSA, the linear relation ship 
between computational expense and system size [l9l . [20l . l2lj . 
which has the form y — mx, with m being the slope (the 
y-intercept is zero since, obviously, a system of zero size re- 
quires zero computational effort), appears here as a line with 
a slope of unity and y-intercept of logjQ(m). All simulations 
were performed on a 3.60 GHz Pentium Xeon processor. 



and finally the inter-complex period. The amplitude and 
intra-complex distance statistically converge to the deter- 
ministic limit at f2 = 10~^^ 1 while the period converges 
at 10~^^ 1. These convergence points differ from those 
for the averages by one to three orders of magnitude (cf. 
Figs. [5]3-[5]c) and indicate a persistence of noise in this 
system at volumes much larger than expected based on 
the analysis of Ref. 63] . 



Finally, in Fig. [5]r we consider the relative "impor- 
tance" of the noise through the coefficient of variation 
(GOV). The idea is that even if noise in an attribute is 
significant from a statistical perspective it might be so 
subtle as to be of little practical import. For example, 
in this case we see that for ft > 10~^° 1 the COVs for 
all attributes are less than a few percent (the discrep- 
ancies between the PLA and the SSA seen in Figs, ^jp, 
and are virtually indiscernible on this scale). The 
noise effects clearly persist up until 10""'^'^ 1 (as seen in 
Figs. [Slff and [5ls as well) but it seems unlikely that in 
any realistic setting, e.g, an embedding within a larger 
"whole-cell" model, they would be of much practical con- 
sequence. Whether or not this is true (it is debatable 
[lil3)' it is certainly the case that it would be difficult, 
if not impossible, to perceive these effects visually. This 
explains, therefore, why Kummer et al. (63j reported the 
stochastic-to-deterministic transition point for this model 
to be at ^10~^° 1 (tens of thousands of Ca^"*" ions). Our 
results thus largely corroborate their claim that a de- 
terministic treatment is justified for volumes larger than 
this. 
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FIG. 5: Statistical results for the Ca periodic-bursting regime. Results of PLA and SSA simulations are compared to 
deterministic (DET) predictions. PLA results are shown as empty symbols (circle, square, diamond) connected by lines. SSA 
results are shown as disconnected, filled symbols. PLA and SSA points designated with the same symbol correspond to the 
same quantity [e.g., in (a), the PLA averages are given as empty circles while the SSA averages are shown as filled circles]. 
All PLA and deterministic values are based on over 10 000 collected data points. For the SSA, over 10 000 data points were 
collected for t7=10"^^ and 10"^° 1 and ~ 1000 were collected for ^ = 10"^^ 1. No SSA results are given for n>10"^* I due to 
computational expense, (a)-(c): Deviations from determinism, shown as percentages [({PLA or SSA}— DET)-^DETx 100%], 
for averages and modes of Ca^""" first-peak amplitudes (Amp 1), first-to-second intra-complex distances (Intra 1-2), and first- 
to-first inter-complex periods (Inter 1-1), respectively (see Fig. O top-left panel). Dashed lines denote 95% confidence intervals 
on the PLA averages [difficult to see in (b) and (c)] . Note that long-tailed distributions lead to averages and modes on opposite 
sides of the deterministic predictions at small volumes in (b) and (c). (d): Ratios of standard deviations ({PLA or SSA}-^DET) 
for the three attributes in (a)-(c). Dashed lines denote 80% confidence intervals (because of the relative weakness of the F-test 
[TTjl. (e): Histogram distances ({PLA or SSA} vs. DET) for the three attributes in (a)-(c). The dashed line denotes twice 
the deterministic self distance (2 x E[_DQEq^]) (see [Z^). The self distances for all three attributes are essentially identical in 
this case. (/): Coefficients of variation (COVs) obtained from PLA and SSA simulations, shown as percentages (standard 
deviation-faveragex 100%), for the three attributes in (a)-(c). Deterministic limits are given as dashed lines. 



B. Repressilator 

Our analysis of the repressilator focuses on behavioral 
changes that arise when the intermittent rates of switch- 
ing between the transcriptional ON and OFF states of 
the genes are varied. The parameter that controls this is 
the telegraph factor 7. From an intuitive standpoint, we 
expect to observe large deviations from determinism at 
small values of 7 and a convergence towards determinis- 
tic behavior with increasing 7 because of the "averaging 
out" of the states of the genes Moreover, by making 
the RNA and protein factors, rj and p, large we minimize 
the effects of shot noise. However, we cannot eliminate it 
completely, and thus we expect to encounter some resid- 
ual effects. Finally, we also expect that the PLA simula- 
tions will begin to bog down as 7 is increased because of 
the growing disparities between the gene-protein binding 
and unbinding rates and the rates of all other reactions 
in the system [45| . 

In Figs. [5Hni these expectations are realized. In Fig. [SI 



we show example time courses for TetR protein (taken 
as representative of the system behavior) that illustrate 
how "deviant effects" Q arise in the repressilator at small 
values of 7. With 7 = 10"'*, we see in Fig. |6]that the true 
behavior of the system, as captured by both the PLA 
and the SSA, differs markedly from that predicted de- 
terministically. Rather than emitting smooth and regu- 
lar oscillations, the system produces large-amplitude in- 
termittent "bursts" of (mRNA and) protein production. 
This is a direct consequence of the slow stochastic switch- 
ing between the ON and OFF states of the genes and is 
consistent with gene -expression behavior often observed 
in eukaryotes Note that due to stochasticity the 

PLA and SSA traces differ from each other. As we shall 
see, however, they are virtually identical from a statisti- 
cal standpoint. 

In Fig. [21 we present results of our statistical analyses 
of the repressilator. At various values of 7, as well as 
at the Adair limit (7 00), we compare the stochastic 
behavior of the system, as captured by both the PLA and 
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FIG. 6: Example time courses (TetR protein) illustrating "de- 
viant effects" 10] in the repressilator at small values of 7. With 
7 = 10~* and r) = p = 1000, stochastic realizations (PLA and 
SSA) differ markedly from the deterministic prediction. 



the SSA, to deterministic predictions. In Figs. [TJa and 
[7?), we consider averages and modes for the TetR-protein 
peak ampUtude and period, respectively. In both cases, 
the PLA and SSA results coincide almost perfectly and 
show large deviations from determinism at small values 
of 7 and a convergence towards the deterministic limit 
with increasing 7. Close inspection of the PLA results 
reveals that statistical convergence to the deterministic 
limit is achieved for both attributes by 7 = 1. It is also 
evident from these plots that the behavior of the full 
model (Table approaches that of the reduced model 
(Table [TIB with increasing 7, as we would expect. 

In Figs. [TJc and [TJi, we consider the distributions of 
the amplitude and the period. Again, we look at ra- 
tios of standard deviations and histogram distances and 
again we see a convergence towards determinism with in- 
creasing 7. However, in this case the deterministic limit 
is never reached; even at the Adair limit we see consid- 
erable deviation from determinism. Furthermore, we see 
very good correspondence between the PLA and the SSA 
results. In fact, the only significant differences that we 
see are the small discrepancies in the histogram distances 
at the Adair limit in Fig. [Tiff. This is interesting in light of 
the discrepancies seen between the PLA and the SSA in 
Figs. [51a and[5?* for the calcium-oscillations model, which 
also contains reduced reaction types (see Table [T| . This 
suggests that reduced reactions might be the source of 
the various inaccuracies seen in Figs. [5land[71 We plan 
to investigate this issue further in the future. 

In Fig. [7ls, we consider the noise strength through the 
GOV. Here, as in Figs. [71c and [Tiff, we see almost perfect 
agreement between the PLA and the SSA results and an 
incomplete convergence towards the deterministic limit 
with increasing 7. It is clear, therefore, that significant 
shot noise effects persist in this system even as 7 ^ 00. 
Moreover, it is interesting to note the elevated levels of 
noise in the amplitude as compared to the period. We see 



an approximately order-of-magnitude difference in the 
COVs for these two attributes at all values of 7 > 10"'' 
and at the Adair limit. Contrast this with Fig.lSjf, which 
shows no appreciable difference between the COVs for 
the amplitude and the period in the calcium-oscillations 
model. This is an example of the type of fine-level insight 
that we can garner via the leaping algorithm. 

It is clear from Figs.[7l3-171s that the repressilator never 
behaves in a fully deterministic manner under the con- 
ditions that we consider. However, it is also clear that 
the behavior does approach that of the reduced model 
with increasing 7. Therefore, in Fig. [51 we quantify this 
convergence to the Adair limit by repeating the statisti- 
cal tests of Figs. [71c and [Tiff but using the PLA and SSA 
results for the reduced model, rather than the determinis- 
tic results at each 7, as our reference. The results clearly 
confirm the (near) convergence of the system behavior to 
the Adair limit at 7 = 1. 

Finally, in Fig. [51 we present results of a step and tim- 
ing analysis comparing the performance of the PLA to 
the SSA for simulations of both the full (Table [IT]) and 
reduced (Table IIIip repressilator models. For the full 
model, we see the convergence in computational expense 
of the PLA and the SSA that we anticipated jj^. In 
Fig. [HfJ, the numbers of steps required for PLA and SSA 
simulations converge asymptotically with increasing 7. 
In Fig. [5^), we see a similar trend for the CPU times, 
although interestingly the curves here cross at 7 = 1 be- 
cause each PLA step is more computationally expensive 
than each SSA step. Also worth noticing is that both 
plots indicate that the expense of the SSA decreases with 
increasing 7 while the opposite is true for the PLA. This 
is because the protein (and mRNA) populations, which 
are the prime bottleneck for the SSA, tend to be larger 
at small values of 7 (cf. Figs. [SI and Wp)- Leaping al- 
gorithms are not affected by population sizes, however, 
having been developed specifically to cope with this prob- 
lem [2J, [5^1 . Hence, we see that when stochastic effects 
in this system are most pronounced (small 7) the PLA 
far outperforms the SSA. 

In Ref. [4^, it was posited that large disparities in 
rate constants would prove to be the prime bottleneck 
for leaping algorithms. This is confirmed in Fig. [9l by 
the declining performance of the PLA with increasing 
7. It is for exactly this reason that we consider the re- 
duced model of Table IIIII In Figs. [3 and [51 we have 
seen that the behavior of the full model approaches that 
of the reduced model with increasing 7. Now, in Fig. [51 
we see that the performance of the PLA is greatly en- 
hanced by the model reduction. Depending on exactly 
how we choose to implement the PLA (see [80[ for an ex- 
planation) , we can achieve gains of between one and four 
orders of magnitude in both simulation steps and run 
times. Additionally, it is important to note that reduc- 
ing the model has very little effect on the performance of 
the SSA. In fact, we see in Fig.lHlthat while the numbers 
of simulation steps required for the SSA remain virtually 
unchanged upon reducing the model, the CPU time ac- 
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FIG. 7: Statistical results for the repressilator. At various values of the telegraph factor 7, and at the Adair limit (7— > cxd), results 
of both PLA and SSA simulations are compared to deterministic (DET) predictions. As in Fig. [S] PLA (and deterministic) 
results are shown as empty symbols connected by lines, SSA results are shown as disconnected, filled symbols, and PLA and 
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tually increases by ~ 50% because of the higher complex- 
ity rate expressions in Table Hill which impose additional 
computational burdens on the algorithm. Our results in- 
dicate, therefore, that there is a distinct advantage to 
using model reduction in conjunction with leaping which 
is absent with regards to the SSA. 



V. DISCUSSION AND CONCLUSIONS 

Using the scarcity of published practical applications 
of leaping as a backdrop, our aim in this article was to 
investigate the performance characteristics of a partic- 
ular leaping algorithm, the PLA, when applied to two 
non-trivial biological models under a variety of condi- 
tions. Our hope was to identify the conditions under 
which leaping proves particularly beneficial and where it 
falters and, hence, provide a kind of guide that will aid 
and motivate future applications of the method. Our use 
of the PLA, as opposed to a different leaping algorithm, 
was based purely on expediency, given that we developed 
the method 'i^. However, the PLA operates on the same 
basic principles as all leaping algorithms and its perfor- 



mance can thus be seen as generally reflective of the en- 
tire class of method. In other words, the accelerations 
that we have reported here are not wholly unique to the 
PLA but are characteristic of the leaping methodology in 
general. Similarly, the bottlenecking that we experienced 
in the face of fast-reversible reactions in the repressilator 
system can be expected to afflict all leaping algorithms. 

That being said, there are attributes of the PLA that 
we believe set it apart from its various counterparts, and 
we would be remiss in not emphasizing these. Fore- 
most among these is its simplicity of implementation and 
ease of use. The algorithm is concise, straightforward 
and overcomes various technical difficulties (e.g., negative 
populations [sil [3^ . Issl ) without the need for extensive 
auxiliary machinery [45|. Using the PLA requires little 
more than a system definition (reactions), rate expres- 
sions (elementary or non-elementary) and definition of 
three simple model-independent parameters ['~1', '3>1', 
'<Cl' (i.e., e)] [4^. Also significant is the ability to force 
the algorithm to perform both deterministic and exact- 
stochastic simulations by simple manipulation of the clas- 
sification parameters (e.g., setting 'KiV=oo or '^l'=0). 
In our case, this significantly simplified the noise quan- 
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times for (6) 1000 PLA and SSA simulation runs of 30 000 s 
of the full (Table |IT]| and reduced (Table IIIip repressilator 
models. All SSA points are based on 100 simulation runs 
(due to computational expense). Note that the CPU curves 
in (b) cross at 7 = 1 because each PLA step is more expensive 
than each SSA step. At the Adair limit, results are given 
for PLA simulations that both include and exclude the exact- 
stochastic (ES) classification (see [131). AH simulations were 
performed on a 3.60 GHz Pentium Xeon processor. 



tifi-cation and step and timing analyses. 

Tangibly speaking, our results clearly illustrate the 
great potential that leaping methods hold in computa- 
tional systems biology. For both the calcium-oscillations 
model and the repressilator, we observed orders-of- 
magnitude accelerations relative to the SSA (Figs. [4] and 
|9]) that made quantifying stochastic effects in these sys- 
tems possible. In the calcium-oscillations case, this gave 
us access to subtle effects of stochasticity that would have 
been indiscernible otherwise (Fig.[5|). For the repressila- 
tor, we actually saw the greatest gains in situations where 
stochastic effects were most prevalent (small 7 — Fig. [7]). 
This is a particularly intriguing result. Gene regulation is 
a common feature of many biological models and our re- 
sults indicate a great potential advantage to using leaping 
in cases of slow transcription-factor binding and unbind- 
ing (such as observed in Ref. [l^). 

A critical aspect of the present study was our ability 
to identify conditions under which the leaping algorithm 
did not perform particularly well. In many ways, this 
may be more valuable in terms of advancing the use of 
leaping methods than is highlighting its strengths. The 
leaping algorithm clearly falters when applied to the full 
repressilator model (Table |TT| with large telegraph fac- 
tor 7 (Fig. [9]) . Intuitively, it is easy to understand why 
this is. The basic strategy underlying all leaping algo- 
rithms is to allow, at each simulation step, as many re- 
action firings as possible without the reaction rates in 
the system changing "appreciably" [U HI, \E^- How- 
ever, in this case there is only a single copy of each gene. 
Thus, only a single binding/unbinding event is possible 
at each simulation step because one firing changes the 
binding/unbinding rates from either finite values to zero 
or vice versa, which is obviously appreciable. When 7 
is small, this is not a problem because the time inter- 
val between successive binding and unbinding events is 
large enough so that many transcription, translation and 
degradation reactions can fire. When 7 is large, however, 
this is no longer the case. The numbers of reaction firings 
become limited due to the high frequency of binding and 
unbinding, and in the extreme limit the effect is such that 
the performance of the algorithm approaches that of the 
SSA (i.e., one reaction firing per step — Fig. [9]). We can 
generalize this observation by saying that small reaction 
subnetworks (pairs of reversible reactions in this case) 
that have small populations and large rate constants are 
prime bottlenecks for leaping algorithms. 

Fortunately, our results also illustrate how one can 
surmount such problems. By applying a simple rapid- 
equilibrium assumption to the first four reactions of Ta- 
ble nil we were able to recover the behavior of the full 
model for 7 > 1 (Fig. [51) at significantly reduced compu- 
tational cost (Fig. [9]) [80] . This includes accurately cap- 
turing stochastic effects associated with finite numbers 
of mRNAs and proteins. Interestingly, we also showed 
that reducing the model has little effect on the perfor- 
mance of the SSA (Fig. [9]). Thus, the chief benefit to 
using model reduction in this case was not in reducing 
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the number of reactions that had to be considered, but 
rather in increasing the size of the time step that could 
be traversed at each simulation step. This is a different 
perspective on the issue than is usual and strongly sug- 
gests that leaping and model reduction should be viewed, 
not as alternative approaches to the problem of timescale 
separation (as is common), but as complementary. Inte- 
grating leaping with advanced model-reduction schemes 
(e.g., [8li [si, il, [H, ii, mi, 113, HI) is thus an area of 
great future interest. As a final note, we did observe some 
(small) disagreement between the PLA and the SSA re- 
sults (Figs. [Hh, \St and [TJi) which may be due to the 
inclusion of the reduced reaction types. This is an issue 
that will be investigated further in the future. 



So, given the effectiveness of the leaping algorithm as 
demonstrated in this article, why are practical applica- 
tions of leaping so scarce in the literature? The answer is 
likely multifaceted. First, the approach is relatively new 
and many researchers may simply be unaware, or only 
vaguely aware, of its existence. Second, newer incarna- 
tions of the method are becoming increasingly complex, 
to the point that, even if aware of their existence, a non- 
expert may be unable to implement them. Third, it has 
been our experience that there is a common mispercep- 
tion that stochastic simulation algorithms can only be 
applied to sets of elementary reaction types. Indeed, 
it is common practice when investigating the stochas- 
tic characteristics of an established biochemical model 
to first "deconstruct" it into elementary reaction steps. 
While this is not, in fact, strictly necessary, it is possi- 
ble that attempts to use leaping algorithms in this way 
have befallen the bottleneck of fast-reversible reactions 
illustrated in Fig. [9l 



It is our hope that this article alleviates, to some ex- 
tent, each of these hindrances to the expanded use of 
leaping algorithms in computational systems biology. For 
we believe that the future of these methods is bright. 
Leaping methods represent a small but important piece 
of the larger puzzle that is comprehensible and action- 
able models of complex biochemical processes. Coupled 
with advanced model-reduction techniques that address 
the problem of rate constant disparities, they can pro- 
vide a sound and practical means by which the problem 
of timescale separation in biological systems can be over- 
come. Further, imbedded into larger modeling and simu- 
lation frameworks that include methods for addressing 
combinatorial complexity 89], spatial localization [90| 
and parameter uncertainty [911 . 192| , the promises of in sil- 
ica biology (gsj might finally be within reach. 



APPENDIX A: THE ADAIR REDUCTION 

For large telegraph factor 7, the gene-protein binding 
and unbinding reactions 
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■yki 



{g-r-r}, 



(Al) 
(A2) 



from Table nil can be assumed to be in rapid equilibrium. 
[Here, we use simpler notation for convenience: g for 
the gene promoter, r for the repressor protein and m for 
mRNA (below).] Setting the forward and reverse rates of 
reaction pairs lAl] and IA2| equal to each other, it is easy 
to show that 



[g] - KiK2[g-r-r]/[r]' 



(A3) 



where Ki = k-i/ki and [•] denotes concentration (or, more 
correctly, occupancy probability). Assuming that the to- 
tal number of genes, gx, is constant. 



5T = [g] + [g-r] + [g-r-r], 
simple algebra leads to 



[5] = 



KiK2gT 



KiK2+K2[r] + [rY' 
It is then straightforward to show that 

K2[r]gT 



[g-r\ 



[g] 



Ki K1K2 + K2[r] + [rY 
[r?gT 



The mRNA transcription reactions are 



g 

{g-T} 

{g-r-r} 



k'' 



g + m, 

{g-r} + 

{g-r-r} 



m. 



(A4) 
(A5) 

(A6) 
(A7) 

(A8) 
(A9) 
(AlO) 



The effective rate expressions for mRNA production 
given in Table IIIII are obtained by multiplying the rate 
constants in reactions IA8HAIOI by the expressions in 
Eas. lA5HA7l These effective expressions are of the Adair 
form [73| . 

Notice that Eq. lA7l is similar in form to a Hill equation 
[7^ with a Hill coefficient of +2 (positive cooperativity) , 
except for the K2[r] term in the denominator. However, 
this term vanishes as [r] — *■ 0, dominated by K1K2, and 
is overwhelmed by [r]^ as [r] -^00. Thus, this term can 
effectively be ignored in all cases other than a small range 
of intermediate levels of [r] . This can be seen, therefore, 
as a type of "derivation" of the Hill equation. Similarly, 
Eq. IA5I can be rewritten as 



[g] = 



9T 



r]-2 + {^KM)-^ + iVKJ^^y 



(All) 
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which is similar to a negative-cooperativity Hill equation 
with a Hill coefficient of —2, again except for the second 
term in the denominator. Note that the expression for 
[g-r] in Eg. IA6l is intermediate between Eqs. lATI and lAll] 
and does not have an analogous Hill form. 



APPENDIX B: MICHAELIS-MENTEN, ADAIR, 
AND r SELECTION 



At each step of a PLA simulation, a time step t is cal- 
culated based on the current state of the system. This 
time step corresponds to the time interval over which 
we expect the reaction rates ("propensities" in the 
stochastic jargon) for all reactions in a system to remain 
essentially constant [H, [H, [H, . We quantify the con- 
cept of "essentially constant" by imposing a constraint on 
the relative change of each reaction propensity [s^ . 

\a^{t + T)-a^{t)\/a^{t)^e (0 < e « 1). (Bl) 

In practice, there are two approaches for determining t. 
The first is a "reaction-based" approach in which the con- 
straint in Eq. lBll is used directly |23i, i28j, i34ii] . The second, 
which is used in this article, is a "species-based" approach 
where the relative changes in each reactant population 
Xi are constrained such that Eq. IBll is satisfied for all 
reactions 1341. 



\X,it + T)-X,{t)\/X,{t)^e/g,. 



(B2) 



Equation IB3I shows that if we constrain |AXi|/Xj = e, 
then \Aa^\/a^ < e, i.e., gi = 1 will sufhce. Similarly, for 
2nd-order MM reactions. 



Ac 



|Aa„ 



< 



Xi + Cm 

9A-^^^ + 

Cm 

'm 



XiXj 



dX, 



■AA„ 



X, + C 



\AX, 
X, 



JAA, 
A, 



(B4) 



Thus, if we constrain both |AA^|/A.i = |AAj|/Aj = e/2, 
then |Aa^|/ap<e, i.e., 5^ = 3^=2. 

The Adair reactions in Table IIIII represent effective 
rates of mRNA production from the unbound, singly- 
bound and doubly-bound gene. In general terms, we can 
think of the rates of mRNA production from an n-bound 
gene with m binding sites (0<n<m). We then have 



(B5) 



where r is the repressor protein population, gx is the total 
number of genes (unity in this case), and Ci=KiXNA^- 
It is easy to show that for m = 2 and n = {0, 1, 2}, Eq. IBS 
reduces to the expressions given in Table [Till 
Following the same procedure as above, we get 



Here, e is the same as in Eq. IBll and gi depends on the 
highest-order reaction species Si is involved in. 

Procedures for determining gi which account for ele- 
mentary reaction types up to third order are given in Cao 
et al. [34| and, in modified form, in Harris and Clancy 
[4^. Basically, each reaction type has associated with 
it a value of gi for each reactant species Si. Thus, one 
merely sifts through all reactions in which Si appears 
as a reactant and sets gi equal to the largest of these 
values. This need be done only once, at the outset of 
a simulation. In this article, we consider non-elementary 
reactions of the Michaelis-Menten (MM) and Adair types 
(Tables U and |TTT| and must, therefore, derive appropriate 
gi expressions for them. 

In Table [H we consider two different types of Michaelis- 
Menten reactions, which we can term Ist-order (e.g., 
PLC* ^ 0) and 2nd-order (e.g., G„-f PLC* PLC*) 
MM types. It is easy to show that gi for each of these 
is the same as for the corresponding elementary reaction 
[H Ell, i.e., g, = l for Ist-order MM and gi = 2 for 2nd- 
order MM. For Ist-order MM reactions we have 



^cat 



Xi + Cm 



A,; 



Aa„ 



|Aa„ 



dXi' 



Cm 



/AA, 



AA,: 



Xi + Cm J Xi 



I A A,, 



< 



I A A, 
A, 



C 



M 



(B3) 



da" 



At 



dr a 



From Eq. IB61 we see that as r-^O, 



lArl 



(B6) 



and 



as r—K>o, 



|Aa"'" 



n — TO 



|Ar| 



This means that if we 



constrain | Ar|/r = e/max{n, |n — to|} then |Aa;j|/a^<e 
in all cases, i.e., gi — max{n, \n — to|}. For the three 
Adair reactions in Table Hill this gives gi = {2, 1,2} for 
n — {0, 1, 2}, respectively. 
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